This report details the research and development work done on MLC++ under ONR
grant N00014-95-T0669.

1 Overview of MLC++

MLC++ is a Machine Learning library of C++ classes. General information about the library,
including source code, can be obtained through the World Wide Web at URL

http://robotics.stanford.edu/users/ronnyk/mlc.html

Over 350 different sites have copied the MLC++ kit, and machine learning research in
the robotics lab at Stanford is enhanced through the use of the library.

2  Summary  of  Results 

As  detailed  in  the  statement  of  work  for  the  grant,  three  main  projects  were  proposed: 

1.  Hybrid  decision  tree  and  nearest-neighbor. 

2.  Stacking  and  Bagging. 

3.  Oblivious  decision  graphs. 

We  now  describe  the  specific  work  done  and  the  results  obtained. 
2.1 Hybrid Decision Tree and Nearest-Neighbor

The  proposal  called  for  implementation  of  a  hybrid  approach  that  uses  decision  trees  for  the 
nominal  features  and  uses  nearest-neighbor  algorithms  at  the  leaves. 

The algorithm was implemented in MLC++, and its accuracy outperformed both decision
tree algorithms and nearest-neighbor approaches on some artificial datasets that originally
motivated this idea. However, on real datasets from the UC Irvine repository
(Murphy & Aha 1995), accuracy did not improve over these approaches in many cases and
degraded in some.

Our analysis revealed that although the decision tree algorithm provides a decomposition of
the problem, its splits fragment the training set, so the space seen by the nearest-neighbor
algorithm at each leaf becomes sparse. In many datasets, the extra representation power
does not make up for having fewer instances per node.
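The hybrid and the fragmentation effect can be illustrated with a minimal C++ sketch. It is not the MLC++ implementation: the `Instance` type and the `HybridTreeNN` class are hypothetical names, and the tree is simplified to one that splits on every nominal feature (no split selection or pruning), with 1-nearest-neighbor on the continuous features at each leaf.

```cpp
#include <cstddef>
#include <limits>
#include <map>
#include <string>
#include <vector>

// Hypothetical instance type: nominal features drive the tree-like partition,
// continuous features are used by nearest-neighbor inside each leaf.
struct Instance {
    std::vector<std::string> nominal;
    std::vector<double> continuous;
    int label;
};

// Simplified hybrid (an assumption of this sketch): a fully expanded "tree"
// that splits on every nominal feature, with 1-NN at each leaf.
class HybridTreeNN {
public:
    void train(const std::vector<Instance>& data) {
        leaves_.clear();
        for (const Instance& inst : data) leaves_[inst.nominal].push_back(inst);
    }

    // Returns the label of the nearest training instance in the matching leaf,
    // or `fallback` if no training instance reached that leaf.
    int predict(const std::vector<std::string>& nominal,
                const std::vector<double>& continuous,
                int fallback = -1) const {
        auto it = leaves_.find(nominal);
        if (it == leaves_.end()) return fallback;
        double best = std::numeric_limits<double>::infinity();
        int label = fallback;
        for (const Instance& inst : it->second) {
            double d = 0.0;  // squared Euclidean distance
            for (std::size_t i = 0;
                 i < continuous.size() && i < inst.continuous.size(); ++i) {
                double diff = continuous[i] - inst.continuous[i];
                d += diff * diff;
            }
            if (d < best) { best = d; label = inst.label; }
        }
        return label;
    }

    // Number of training instances in the leaf for the given nominal values.
    // Small counts show how the splits fragment the training set.
    std::size_t leafSize(const std::vector<std::string>& nominal) const {
        auto it = leaves_.find(nominal);
        return it == leaves_.end() ? 0 : it->second.size();
    }

private:
    std::map<std::vector<std::string>, std::vector<Instance>> leaves_;
};
```

Calling `leafSize` for each combination of nominal values shows how few instances remain for the nearest-neighbor step once the data has been split several times.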

We are currently experimenting with other hybrid approaches that involve parametric
algorithms, such as Naive-Bayes (Langley, Iba & Thompson 1992, Duda & Hart 1973), which
(because of their limited representation power) do not suffer from the curse of dimensionality
in high-dimensional spaces, and might therefore be more suitable for this approach.

2.2  Stacking  and  Bagging 

Stacking (Wolpert 1992), sometimes called Bagging (Breiman 1994), averaging (Perrone
1993) or ensembles (Krogh & Vedelsby 1995), utilizes multiple classifiers that “vote” on the
predicted class.

The voting algorithm has been implemented in MLC++, thus providing the ability to
wrap around and aggregate any existing algorithm or new algorithm that is implemented in
MLC++.
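The wrapper idea can be sketched in a few lines of C++. This is only an illustration under stated assumptions, not the MLC++ interface: the `Example`, `Classifier`, and `Inducer` types and the `bag` and `majorityVote` functions are hypothetical names, and the sketch uses bootstrap replicates with an unweighted majority vote.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <random>
#include <vector>

// Hypothetical types for this sketch; MLC++ uses its own class hierarchy.
using Features = std::vector<double>;
struct Example { Features x; int label; };
using Classifier = std::function<int(const Features&)>;
using Inducer = std::function<Classifier(const std::vector<Example>&)>;

// Unweighted majority vote; ties go to the smallest label.
// Returns -1 if there are no members.
int majorityVote(const std::vector<Classifier>& members, const Features& x) {
    std::map<int, int> counts;
    for (const Classifier& c : members) ++counts[c(x)];
    int bestLabel = -1, bestCount = 0;
    for (const auto& kv : counts) {
        if (kv.second > bestCount) { bestLabel = kv.first; bestCount = kv.second; }
    }
    return bestLabel;
}

// Wraps any inducer: trains `rounds` classifiers on bootstrap replicates
// (sampled with replacement) and returns a classifier that takes their vote.
Classifier bag(const Inducer& induce, const std::vector<Example>& data,
               int rounds, unsigned seed = 0) {
    if (data.empty()) return [](const Features&) { return -1; };
    std::vector<Classifier> members;
    std::mt19937 rng(seed);
    std::uniform_int_distribution<std::size_t> pick(0, data.size() - 1);
    for (int r = 0; r < rounds; ++r) {
        std::vector<Example> replicate;
        replicate.reserve(data.size());
        for (std::size_t i = 0; i < data.size(); ++i) {
            replicate.push_back(data[pick(rng)]);
        }
        members.push_back(induce(replicate));
    }
    return [members](const Features& x) { return majorityVote(members, x); };
}
```

Because `bag` only depends on the `Inducer` signature, any learning algorithm that can be expressed as "training set in, classifier out" can be wrapped without modification.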

The algorithm was used in demonstrating the bias and variance for zero-one loss functions
(Kohavi & Wolpert 1996). Approaches such as nearest-neighbor, which have a bias problem
in high dimensions, will not improve through voting. However, improvements can be seen
with methods that suffer from high variance, such as decision tree algorithms, or when
nonconvergent methods are used (Finnoff, Hergert & Zimmermann 1993).
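For reference, the zero-one-loss decomposition of the cited paper can be restated as follows. The notation is an assumption of this restatement rather than something defined in this report: $Y_F$ is the class assigned by the target, $Y_H$ is the class predicted by the learned hypothesis, and $C$ is the misclassification cost.

```latex
\begin{align*}
E[C] &= \sum_{x} P(x)\,\bigl(\sigma_x^2 + \mathrm{bias}_x^2 + \mathrm{variance}_x\bigr) \\
\mathrm{bias}_x^2 &= \tfrac{1}{2} \sum_{y} \bigl[P(Y_F = y \mid x) - P(Y_H = y \mid x)\bigr]^2 \\
\mathrm{variance}_x &= \tfrac{1}{2} \Bigl(1 - \sum_{y} P(Y_H = y \mid x)^2\Bigr) \\
\sigma_x^2 &= \tfrac{1}{2} \Bigl(1 - \sum_{y} P(Y_F = y \mid x)^2\Bigr)
\end{align*}
```

Summing the three terms at a point $x$ gives $1 - \sum_y P(Y_F = y \mid x)\,P(Y_H = y \mid x)$, the expected error rate at $x$. Voting mainly acts on the variance term, which is consistent with the observation that unstable methods benefit while bias-limited ones do not.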

2.3  Oblivious  Decision  Graph 

Oblivious decision graphs provide a hypothesis space that is easy to understand, yet does not
suffer from some of the shortcomings of decision trees. The original work (Kohavi 1994),
which was supported by a previous ONR grant, was extended to deal with continuous attributes
through discretization.
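A minimal C++ sketch of how such a graph classifies an instance is given below. It relies on the defining property that every node at a given level tests the same attribute. The `Level` and `ObliviousGraph` structures, the `discretize` helper, and the use of fixed cut points are assumptions of this sketch; the report does not specify the discretization method or the MLC++ data structures.

```cpp
#include <cstddef>
#include <vector>

// Maps a continuous value to an interval index given sorted cut points.
// The cut points are hypothetical; how they are chosen is not covered here.
std::size_t discretize(double value, const std::vector<double>& cuts) {
    std::size_t bin = 0;
    while (bin < cuts.size() && value >= cuts[bin]) ++bin;
    return bin;
}

// One level of an oblivious graph: every node at this level tests the same
// attribute. next[node][bin] is the node index reached at the following level.
struct Level {
    std::size_t attribute;
    std::vector<double> cuts;
    std::vector<std::vector<std::size_t>> next;
};

struct ObliviousGraph {
    std::vector<Level> levels;    // evaluated top to bottom, starting at node 0
    std::vector<int> leafLabels;  // labels of the nodes below the last level

    // Assumes x has a value for every tested attribute and that the
    // `next` tables cover every (node, bin) pair.
    int classify(const std::vector<double>& x) const {
        std::size_t node = 0;
        for (const Level& level : levels) {
            std::size_t bin = discretize(x[level.attribute], level.cuts);
            node = level.next[node][bin];
        }
        return leafLabels[node];
    }
};
```

Because different paths may lead to the same node, a concept such as exclusive-or over two thresholded attributes needs only two leaves here, whereas a decision tree would need four.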

The  results  were  published  in  Kohavi’s  dissertation  with  graphs  depicting  some  target 
concepts  from  the  UCI  database  (Kohavi  1995,  Chapter  6).  The  resulting  graphs  are  much 
easier  to  comprehend  in  most  cases  than  the  equivalent  decision  trees. 


3  Summary 

We have implemented the three proposed algorithms: the hybrid approach, bagging, and
oblivious decision graphs.

The hybrid approach led to some negative results, and we are considering alternative
approaches based on the same idea. The implementation of bagging now provides any
algorithm implemented in MLC++ with a wrapper that might improve its accuracy,
especially if the algorithm suffers from instability, as do decision trees. The implementation
of oblivious decision graphs was improved, with good results that were shown in Kohavi’s
dissertation.

Copies of papers based on this research are available on the web (as noted in the
References), and hard copies can be requested from the author.